Political News Sentiment Analysis for Under-resourced Languages
نویسندگان
چکیده
This paper presents classification results for the analysis of sentiment in political news articles. The domain of political news is particularly challenging, as journalists are presumably objective, whilst at the same time opinions can be subtly expressed. To deal with this challenge, in this work we conduct a two-step classification model, distinguishing first subjective and second positive and negative sentiment texts. More specifically, we propose a shallow machine learning approach where only minimal features are needed to train the classifier, including sentiment-bearing CoOccurring Terms (COTs) and negation words. This approach yields close to state-of-the-art results. Contrary to results in other domains, the use of negations as features does not have a positive impact in the evaluation results. This method is particularly suited for languages that suffer from a lack of resources, such as sentiment lexicons or parsers, and for those systems that need to function in real-time.
منابع مشابه
Benchmarking Machine Translated Sentiment Analysis for Arabic Tweets
Traditional approaches to Sentiment Analysis (SA) rely on large annotated data sets or wide-coverage sentiment lexica, and as such often perform poorly on under-resourced languages. This paper presents empirical evidence of an efficient SA approach using freely available machine translation (MT) systems to translate Arabic tweets to English, which we then label for sentiment using a state-of-th...
متن کاملContrastive Analysis of Political News Headlines Translation According to Berman’s Deformative Forces
The present research aimed at investigating the deformation of political news headlines translation between English and Persian News Agencies based on Berman`s deformative system. For this purpose, 100 news headlines in English were selected from BBC, Reuters, Associated Press, France, France 24, Financial Times, Business Times, New York Times, Politico, Guardian, CNN, Bloomberg, Middle East Ey...
متن کاملMultiBooked: A Corpus of Basque and Catalan Hotel Reviews Annotated for Aspect-level Sentiment Classification
While sentiment analysis has become an established field in the NLP community, research into languages other than English has been hindered by the lack of resources. Although much research in multi-lingual and cross-lingual sentiment analysis has focused on unsupervised or semi-supervised approaches, these still require a large number of resources and do not reach the performance of supervised ...
متن کاملSentiment Classification in Under-Resourced Languages Using Graph-Based Semi-Supervised Learning Methods
In sentiment classification, conventional supervised approaches heavily rely on a large amount of linguistic resources, which are costly to obtain for under-resourced languages. To overcome this scarce resource problem, there exist several methods that exploit graph-based semisupervised learning (SSL). However, fundamental issues such as controlling label propagation, choosing the initial seeds...
متن کامل